23 research outputs found

    Fast Estimation of Multinomial Logit Models: R Package mnlogit

    Get PDF
    We present the R package mnlogit for estimating multinomial logistic regression models, particularly those involving a large number of categories and variables. Compared to existing software, mnlogit offers speedups of 10 - 50 times for modestly sized problems and more than 100 times for larger problems. Running in parallel mode on a multicore machine gives up to 4 times additional speedup on 8 processor cores. mnlogit achieves its computational efficiency by drastically speeding up computation of the log-likelihood function's Hessian matrix through exploiting structure in matrices that arise in intermediate calculations. This efficient exploitation of intermediate data structures allows mnlogit to utilize system memory much more efficiently, such that for most applications mnlogit requires less memory than comparable software by a factor that is proportional to the number of model categories

    Multivariate-From-Univariate MCMC Sampler: The R Package MfUSampler

    Get PDF
    The R package MfUSampler provides Markov chain Monte Carlo machinery for generating samples from multivariate probability distributions using univariate sampling algorithms such as the slice sampler and the adaptive rejection sampler. The multivariate wrapper performs a full cycle of univariate sampling steps, one coordinate at a time. In each step, the latest sample values obtained for other coordinates are used to form the conditional distributions. The concept is an extension of Gibbs sampling where each step involves, not an independent sample from the conditional distribution, but a Markov transition for which the conditional distribution is invariant. The software relies on proportionality of conditional distributions to the joint distribution to implement a thin wrapper for producing conditionals. Examples illustrate basic usage as well as methods for improving performance. By encapsulating the multivariate-from-univariate logic, package MfUSampler provides a reliable package for rapid prototyping of custom Bayesian models while allowing for incremental performance optimizations such as taking advantage of conditional independence, and high-performance implementation of function evaluations. Utility functions for MCMC diagnostics as well as sample-based construction of predictive posterior distributions are provided in MfUSampler

    Bayesian, and Non-Bayesian, Cause-Specific Competing-Risk Analysis for Parametric and Nonparametric Survival Functions: The R Package CFC

    Get PDF
    The R package CFC performs cause-specific, competing-risk survival analysis by computing cumulative incidence functions from unadjusted, cause-specific survival functions. A high-level API in CFC enables end-to-end survival and competing-risk analysis, using a single-line function call, based on the parametric survival regression models in the survival package. A low-level API allows users to achieve more flexibility by supplying their custom survival functions, perhaps in a Bayesian setting. Utility methods for summarizing and plotting the output allow population-average cumulative incidence functions to be calculated, visualized and compared to unadjusted survival curves. Numerical and computational optimization strategies are employed for efficient and reliable computation of the coupled integrals involved. To address potential integrable singularities caused by infinite cause-specific hazards, particularly near time-from-index of zero, integrals are transformed to remove their dependency on hazard functions, making them solely functions of causespecific, unadjusted survival functions. This implicit variable transformation also provides for easier extensibility of CFC to handle custom survival models since it only requires the users to implement a maximum of one function per cause. The transformed integrals are numerically calculated using a generalization of Simpson's rule to handle the implicit change of variable from time to survival, while a generalized trapezoidal rule is used as reference for error calculation. An OpenMP-parallelized, efficient C++ implementation - using packages Rcpp and RcppArmadillo - makes the application of CFC in Bayesian settings practical, where a potentially large number of samples represent the posterior distribution of cause-specific survival functions

    Stochastic Newton Sampler: The R Package sns

    Get PDF
    The R package sns implements the stochastic Newton sampler (SNS), a MetropolisHastings Markov chain Monte Carlo (MCMC) algorithm where the proposal density function is a multivariate Gaussian based on a local, second-order Taylor-series expansion of log-density. The mean of the proposal function is the full Newton step in the NewtonRaphson optimization algorithm. Taking advantage of the local, multivariate geometry captured in log-density Hessian allows SNS to be more efficient than univariate samplers, approaching independent sampling as the density function increasingly resembles a multivariate Gaussian. SNS requires the log-density Hessian to be negative-definite everywhere in order to construct a valid proposal function. This property holds, or can be easily checked, for many GLM-like models. When the initial point is far from density peak, running SNS in non-stochastic mode by taking the Newton step - augmented with line search - allows the MCMC chain to converge to high-density areas faster. For high-dimensional problems, partitioning the state space into lower-dimensional subsets, and applying SNS to the subsets within a Gibbs sampling framework can significantly improve the mixing of SNS chains. In addition to the above strategies for improving convergence and mixing, sns offers utilities for diagnostics and visualization, sample-based calculation of Bayesian predictive posterior distributions, numerical differentiation, and log-density validation
    corecore